A bottom-up approach for handling unseen triphones in large vocabulary continuous speech recognition

نویسندگان

  • Xavier L. Aubert
  • Peter Beyerlein
  • Meinhard Ullrich
چکیده

This paper presents an extension of bottom-up state-tying towards improved handling of unseen triphones. As opposed to the usual backing-o to diphones and monophones, the current method aims at nding a triphone model that has proven to exhibit some similarity with the unseen triphone. It is based on a probabilistic mapping of unseen contexts to clusters of triphone-states observed in the training data. This algorithm has been applied to dictation tasks for three languages with vocabulary sizes ranging from 20k to 64k. The results compare favorably with those obtained using standard back-o rules. This technique also o ers an alternative to top-down decision-tree procedures which are frequently used especially for their generalization capabilities.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Predicting Unseen Triphones with Senones - Speech and Audio Processing, IEEE Transactions on

In large-vocabulary speech recognition, we often encounter triphones that are not covered in the training data. These unseen triphones are usually backed off to their corresponding diphones or context-independent phones, which contain less context yet have plenty of training examples. In this paper, we propose to use decision-tree-based senones to generate needed senonic baseforms for these uns...

متن کامل

Predicting unseen triphones with senones

In large-vocabulary speech recognition, the decoder often encounters triphones that are not covered in the training data. These unseen triphones are usually represented by corresponding diphones or context independent monophones. We propose to use decision-tree based senones to generate needed senonic baseforms for unseen triphones. A decision tree is built for each individual Markov state of e...

متن کامل

Decision tree-based triphones are robust and practical for mandarian speech recognition

In large-vocabulary, speaker-independent speech recognition systems, modeling of vocabulary words by subword units is mandatory. This paper studies the use of triphone units for Mandarin speech recognition compared to biphone and context-independent phonetic units. In order to solve unseen triphones in speech recognition, decision-tree based clustering is used in triphone units. This method ach...

متن کامل

Phonetic Modelling in the Philips Chinese Continuous Speech Recognition System

We have extended the Philips large vocabulary continuous speech recognition system towards Chinese On the way from our existing Western language technology to Mandarin the rst step was to build a suitable phonetic model This paper describes the development of our phonetic model excluding tones for Mandarin Chinese We will present a systematic comparison of three forms of sub syllabic units for ...

متن کامل

Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition

Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1996